Data Dependant Learners Ensemble Pruning

Authors

  • Gang Zhang
  • Jian Yin
  • Xiaomin He
  • Lianglun Cheng
Abstract

Ensemble learning aims at combining several slightly different learners to construct a stronger learner. An ensemble of a well-selected subset of learners can outperform an ensemble of all available learners. However, the well-studied accuracy/diversity ensemble pruning framework can overfit the training data, resulting in a target learner with relatively low generalization ability. We propose to ensemble base learners trained on both labeled and unlabeled data, adopting a data-dependent kernel mapping, which has proved successful in semi-supervised learning, to obtain base learners that generalize better. We bootstrap both the training data and the unlabeled data (the point cloud) to build slightly different data sets, and then construct a data-dependent kernel for each. With such kernels, data points are mapped to different feature spaces, which yields an effective ensemble. We also prove that an ensemble of learners trained on both labeled and unlabeled data has better generalization ability in the sense of the graph Laplacian. Experiments on the UCI data repository show the effectiveness of the proposed method.
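The pipeline the abstract describes — bootstrap the labeled set and the point cloud, deform a base kernel with the graph Laplacian of each replicate, train a base learner per kernel, and vote — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF base kernel, the Sindhwani-style Laplacian deformation K̃ = K − K(I + MK)⁻¹MK, the transductive kernel-ridge base learner, and all parameter values are assumptions, and for simplicity only the labeled set is bootstrapped here.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_gram(A, B, gamma=1.0):
    """RBF base-kernel Gram matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def knn_laplacian(X, k=5, gamma=1.0):
    """Unnormalized graph Laplacian of a k-NN similarity graph over X."""
    W = rbf_gram(X, X, gamma)
    np.fill_diagonal(W, 0.0)
    drop = np.argsort(W, axis=1)[:, :-k]      # all but the k strongest edges
    W_k = W.copy()
    np.put_along_axis(W_k, drop, 0.0, axis=1)
    W_k = np.maximum(W_k, W_k.T)              # symmetrize
    return np.diag(W_k.sum(1)) - W_k

def deformed_gram(X, reg=1.0, gamma=1.0, k=5):
    """Data-dependent kernel on the point cloud X (an assumed deformation):
    K_tilde = K - K (I + M K)^{-1} M K, with M = reg * graph Laplacian."""
    K = rbf_gram(X, X, gamma)
    M = reg * knn_laplacian(X, k, gamma)
    n = len(X)
    return K - K @ np.linalg.solve(np.eye(n) + M @ K, M @ K)

def fit_predict(K, lab_idx, y_lab, lam=1e-2):
    """Transductive kernel-ridge base learner: fit on labeled rows of a
    precomputed Gram matrix, score every point in the cloud."""
    K_ll = K[np.ix_(lab_idx, lab_idx)]
    alpha = np.linalg.solve(K_ll + lam * np.eye(len(lab_idx)), y_lab)
    return K[:, lab_idx] @ alpha

# Toy data: two well-separated blobs, few labels, many unlabeled points.
X_lab = np.vstack([rng.normal([0, 0], 0.3, (5, 2)),
                   rng.normal([3, 3], 0.3, (5, 2))])
y_lab = np.array([-1] * 5 + [1] * 5)
X_unl = np.vstack([rng.normal([0, 0], 0.3, (40, 2)),
                   rng.normal([3, 3], 0.3, (40, 2))])
y_true = np.array([-1] * 40 + [1] * 40)

votes = np.zeros(len(X_unl))
for _ in range(5):                            # five base learners
    li = rng.choice(len(X_lab), len(X_lab), replace=True)  # bootstrap labels
    cloud = np.vstack([X_lab[li], X_unl])     # point cloud for this replicate
    K_t = deformed_gram(cloud)                # per-replicate deformed kernel
    scores = fit_predict(K_t, np.arange(len(li)), y_lab[li])
    votes += np.sign(scores[len(li):])        # vote on the unlabeled points

acc = (np.sign(votes) == y_true).mean()
print(f"ensemble accuracy on unlabeled points: {acc:.2f}")
```

Because each bootstrap replicate induces a different Laplacian, each base learner effectively operates in a different deformed feature space, which is the source of ensemble diversity the abstract points to.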


Related articles

Pareto Ensemble Pruning

Ensemble learning is among the state-of-the-art learning techniques, which trains and combines many base learners. Ensemble pruning removes some of the base learners of an ensemble, and has been shown to be able to further improve the generalization performance. However, the two goals of ensemble pruning, i.e., maximizing the generalization performance and minimizing the number of base learners...


Effect of Pruning and Early Stopping on Performance of a Boosting Ensemble

Generating an architecture for an ensemble of boosting machines involves making a series of design decisions. One design decision is whether to use simple “weak learners” such as decision tree stumps or more complicated weak learners such as large decision trees or neural networks. Another design decision is the training algorithm for the constituent weak learners. Here we concentrate on binary...


Diversity-Based Boosting Algorithm

Boosting is a well known and efficient technique for constructing a classifier ensemble. An ensemble is built incrementally by altering the distribution of training data set and forcing learners to focus on misclassification errors. In this paper, an improvement to Boosting algorithm called DivBoosting algorithm is proposed and studied. Experiments on several data sets are conducted on both Boo...


Multilayer Ensemble Pruning via Novel Multi-sub-swarm Particle Swarm Optimization

Recently, classifier ensemble methods are gaining more and more attention in the machine-learning and data-mining communities. In most cases, the performance of an ensemble is better than a single classifier. Many methods for creating diverse classifiers were developed during the past decade. When these diverse classifiers are generated, it is important to select the proper base classifier to j...


A competitive ensemble pruning approach based on cross-validation technique

Ensemble pruning is crucial for the considerations of both efficiency and predictive accuracy of an ensemble system. This paper proposes a new Competitive measure for Ensemble Pruning based on Cross-Validation technique (CEPCV). Firstly, the data to be learnt by neural computing models are mostly drifting with time and environment, while the proposed CEPCV method can realize on-line ensemble pr...




Journal:
  • JSW

Volume 7, Issue -

Pages -

Published 2012